Lexique 2: a new French lexical database.

Authors

  • Boris New
  • Christophe Pallier
  • Marc Brysbaert
  • Ludovic Ferrand
Abstract

In this article, we present a new lexical database for French: Lexique. In addition to classical word information such as gender, number, and grammatical category, Lexique includes a series of interesting new characteristics. First, word frequencies are based on two cues: a contemporary corpus of texts and the number of Web pages containing the word. Second, the database is split into a graphemic table with all the relevant frequencies, a table structured around lemmas (particularly interesting for the study of the inflectional family), and a table about surface frequency cues. Third, Lexique is distributed under a GNU-like license, allowing people to contribute to it. Finally, a metasearch engine, Open Lexique, has been developed so that new databases can be added very easily to the existing ones. Lexique can either be downloaded or interrogated freely from http://www.lexique.org.


Similar resources

MEGALEX: A megastudy of visual and auditory word recognition.

Using the megastudy approach, we report a new database (MEGALEX) of visual and auditory lexical decision times and accuracy rates for tens of thousands of words. We collected visual lexical decision data for 28,466 French words and the same number of pseudowords, and auditory lexical decision data for 17,876 French words and the same number of pseudowords (synthesized tokens were used for the a...


Principles of Lexical Network Systemic Modeling (Principes de modélisation systémique des réseaux lexicaux) [in French]

We introduce a new approach for manually constructing broad-coverage lexical resources based on a specific type of lexical network called lexical system. Drawing on experience gained from the construction of the French Lexical Network (fr-LN), we begin by formally characterizing lexical systems as “small-world” graphs of lexical units that are primarily organized around the system of Meaning-T...


Lexical acquisition from corpora: the case of subcategorization frames in French

We present in this paper a method to automatically acquire a syntactic lexicon of subcategorization frames for French verbs directly from large corpora. The method is evaluated against existing lexical resources: we show that our system is capable of producing new frames that were not previously registered. Lastly, we show that it is possible to induce lexico-semantic classes « à la Levin » (19...


Towards the establishment of a LMF-based Wolof language lexicon (Vers la mise en place d'un lexique basé sur LMF pour la langue wolof) [in French]

Wolof is the most widely spoken language in Senegal, but its effective use in education and training requires the development of NLP tools, which in turn depend on a lexicon. Unfortunately, such a lexicon does not exist, and building one requires a prior linguistic study of the data structure of the language. However, such work has been done for the development of a lexical database for the Wolof lan...



Journal title:
  • Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc

Volume 36, Issue 3

Pages: -

Publication date: 2004